Conversation

@aniruddh-alt (Contributor) commented Sep 27, 2025

Description

Add AWS Bedrock Converse API Support

This PR adds support for AWS Bedrock's Converse API as a new inference engine in Oumi.

Key Features Added:

  • New BedrockInferenceEngine - Implements AWS Bedrock Converse API integration
  • Framework Integration - Added BEDROCK to InferenceEngineType enum and registered in the engine builder
  • AWS Credentials Support - Automatic AWS credential handling via boto3 (IAM roles, profiles, environment variables)
  • Concurrent Processing - Async inference with proper concurrency control and rate limiting
  • Error Handling - AWS-specific error handling with retry logic and exponential backoff
  • Parameter Support - Supports key generation parameters: max_new_tokens, temperature, top_p, stop_strings
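The retry behavior described in the last two bullets can be sketched roughly as follows. The helper names `backoff_schedule` and `call_with_retries` are illustrative, not the actual methods in this PR, and a generic exception type stands in for boto3's throttling errors:

```python
import time


def backoff_schedule(max_retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    # Delay doubles on each attempt, clamped at `cap` seconds.
    return [min(cap, base * (2**attempt)) for attempt in range(max_retries)]


def call_with_retries(fn, retryable_exc, max_retries: int = 4, base: float = 1.0):
    # Retry `fn` on throttling-style errors, sleeping per the schedule above;
    # re-raise once the final attempt fails.
    delays = backoff_schedule(max_retries, base=base)
    for attempt, delay in enumerate(delays):
        try:
            return fn()
        except retryable_exc:
            if attempt == max_retries - 1:
                raise
            time.sleep(delay)
```

In the real engine, the retryable case would be botocore's `ClientError` carrying a throttling error code rather than a plain Python exception.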

Implementation Details:

  • Uses boto3 for AWS SDK integration (preferred over raw HTTP for AWS services)
  • Extends RemoteInferenceEngine with Bedrock-specific implementations
  • Overrides batch inference methods with NotImplementedError (Bedrock doesn't support OpenAI-style batch API)
  • Implements required abstract methods: _infer_online(), _infer(), _query_api()
  • Includes comprehensive unit tests for conversation conversion and API response parsing
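As a rough sketch of the conversation conversion those tests cover: the Converse API expects each message's `content` as a list of content blocks and takes system turns in a separate `system` field. The helper name `to_converse_request` is illustrative, not the PR's actual method:

```python
def to_converse_request(messages: list[dict]) -> dict:
    # System turns go in Converse's separate `system` field.
    system = [{"text": m["content"]} for m in messages if m["role"] == "system"]
    # Remaining turns become Converse messages with content-block lists.
    converse_messages = [
        {"role": m["role"], "content": [{"text": m["content"]}]}
        for m in messages
        if m["role"] != "system"
    ]
    request = {"messages": converse_messages}
    if system:
        request["system"] = system
    return request
```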

Usage Example:

```python
from oumi.builders.inference_engines import build_inference_engine
from oumi.core.configs import InferenceEngineType, ModelParams

# Create Bedrock engine
engine = build_inference_engine(
    InferenceEngineType.BEDROCK,
    ModelParams(model_name="anthropic.claude-3-sonnet-20240229-v1:0"),
)

# Run inference
results = engine.infer(conversations)
```
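Under the hood, `engine.infer` parses Converse API responses, whose generated text lives at `output.message.content[0].text`. A minimal parsing sketch (the function name is hypothetical, not the PR's actual method):

```python
def parse_converse_response(response: dict) -> str:
    # Concatenate the text blocks of the assistant message in the response.
    blocks = response["output"]["message"]["content"]
    return "".join(b["text"] for b in blocks if "text" in b)
```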

Configuration:

```yaml
model:
  model_name: "us.anthropic.claude-3-sonnet-20240229-v1:0"

inference_engine:
  engine_type: "BEDROCK"

generation:
  max_new_tokens: 1024
  temperature: 0.7
  top_p: 0.9
```
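The generation parameters above map onto the Converse API's `inferenceConfig` keys (`maxTokens`, `temperature`, `topP`, `stopSequences`). A sketch of that mapping, with an illustrative helper name:

```python
def to_inference_config(max_new_tokens=None, temperature=None, top_p=None, stop_strings=None):
    # Translate Oumi-style generation params to Converse `inferenceConfig` keys,
    # omitting anything unset so Bedrock applies its defaults.
    cfg = {}
    if max_new_tokens is not None:
        cfg["maxTokens"] = max_new_tokens
    if temperature is not None:
        cfg["temperature"] = temperature
    if top_p is not None:
        cfg["topP"] = top_p
    if stop_strings:
        cfg["stopSequences"] = list(stop_strings)
    return cfg
```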

Related issues

Fixes #1971

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the contributor guideline Pull Request guidelines?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

@aniruddh-alt aniruddh-alt marked this pull request as ready for review September 29, 2025 21:12
@aniruddh-alt (Contributor, Author)

I have tested the inference engine with my AWS API key and everything works fine. I have attached the local test script I ran for your reference:
[image: screenshot of the test script]

@oelachqar (Contributor)

Thank you @aniruddh-alt for the contribution! Reviewing now

@aniruddh-alt (Contributor, Author)

I just realized that there are some issues with my unit tests around how boto3 is imported. I will refer to the llama.cpp engine tests and fix these issues. Apologies for the inconvenience!

@aniruddh-alt (Contributor, Author)

I have fixed my unit tests and also addressed the failing CPU tests with my latest commit.

@aniruddh-alt aniruddh-alt requested a review from oelachqar October 7, 2025 15:58
@oelachqar oelachqar added this pull request to the merge queue Oct 20, 2025
Merged via the queue into oumi-ai:main with commit d264731 Oct 20, 2025
1 check passed

Development

Successfully merging this pull request may close these issues.

[Feature] Bedrock ConverseAPI support for inference
